Refocussing on the Text No in Text-to-speech

Authors

  • Andrew Breen
  • Barry Eggleton
  • Steve Minnis
Abstract

Many Natural Language Processing applications depend crucially on the front-end processes that handle the input text and transform it into a form usable by the more “sophisticated” linguistic components of the application. Despite this crucial role, these front-end processes are often considered uninteresting, yet it is not unusual for the perception of the complete application to be affected by this weakest link in the processing chain. With the recent productisation of many text-to-speech (TTS) systems, the performance of the TTS front-end process, typically called the text normalization (TN) process, has been highlighted. This component performs sentence recognition, symbol and term expansion, and word tokenisation, but these tasks are not independent. For this reason, enhancing TN coverage often has adverse side effects, especially when dealing with unrestricted text, so a crucial part of the development of our Nuance Vocalizer 2.0 TTS system is comprehensive regression testing of coverage. As TTS systems are increasingly employed as part of general application suites, the TN component becomes the main interface with the controlling applications. This interface requires a detailed specification, which in turn lends itself to testing. Preprocessors, such as SSML transducers and email filters, should ensure that no information is lost when they subsume some of the tasks that TN would normally undertake. Refocusing attention on the TN process and its testing is timely and can pay important dividends.
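
The interdependence of the TN tasks, and the reason regression testing matters, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abbreviation table, the `normalise` and `run_regression` helpers, and the test sentences are hypothetical and are not taken from Nuance Vocalizer or any real TTS system. The sketch shows how a term expansion that removes a full stop can silently destroy a sentence boundary, and how a small regression suite catches the side effect.

```python
import re

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street"}


def normalise(text):
    """Toy TN: expand abbreviations, split into sentences, tokenise words."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)
    # Sentence recognition: split on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Word tokenisation: keep runs of letters, digits and apostrophes.
    return [re.findall(r"[A-Za-z0-9']+", s) for s in sentences]


# Regression suite: input text mapped to the desired sentence/word structure.
REGRESSION_CASES = {
    "Dr. Smith lives here. He is out.": [
        ["Doctor", "Smith", "lives", "here"],
        ["He", "is", "out"],
    ],
    # "St." both abbreviates "Street" and ends the sentence.
    "Turn left on Main St. Then stop.": [
        ["Turn", "left", "on", "Main", "Street"],
        ["Then", "stop"],
    ],
}


def run_regression(cases):
    """Return the inputs whose normalised output differs from the expectation."""
    return [text for text, expected in cases.items() if normalise(text) != expected]


if __name__ == "__main__":
    for failing_input in run_regression(REGRESSION_CASES):
        print("REGRESSION:", failing_input, "->", normalise(failing_input))
```

In this sketch the second case fails: expanding "St." consumes the full stop that also marked the sentence end, so the two sentences merge. Adding the "St." entry improved expansion coverage but degraded sentence recognition, which is the kind of side effect a regression suite is meant to expose.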

Similar resources

Cipher text only attack on speech time scrambling systems using correction of audio spectrogram

Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics in the multimedia domain is enabling computers to produce speech. Speech synthesis grants the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...

L2 Learners’ Lexical Inferencing: Perceptual Learning Style Preferences, Strategy Use, Density of Text, and Parts of Speech as Possible Predictors

This study was intended first to categorize the L2 learners in terms of their learning style preferences and second to investigate if their learning preferences are related to lexical inferencing. Moreover, strategies used for lexical inferencing and text related issues of text density and parts of speech were studied to determine their moderating effects and the best predictors of lexical infe...

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

The Impact of Contextual Clue Selection on Inference

Linguistic information can be conveyed in the form of speech and written text, but it is the content of the message that is ultimately essential for higher-level processes in language comprehension, such as making inferences and associations between text information and knowledge about the world. Linguistically, inference is the shovel that allows receivers to dig meaning out from the text with...

Publication date: 2002